Overview


Dataset statistics

Number of variables: 11
Number of observations: 7497
Missing cells: 2711
Missing cells (%): 3.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 644.4 KiB
Average record size in memory: 88.0 B
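
The percentage and per-record figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming that missing cells are measured against all 11 × 7497 cells and that 1 KiB is 1024 bytes (both are assumptions, checked only against the rounded values reported here):

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 11
n_observations = 7497
missing_cells = 2711
total_size_kib = 644.4

# Share of missing cells over the whole table (assumed denominator: rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

All 2711 missing cells come from a single column (배터리용량), which is why the dataset-wide share is much lower than that column's own 36.2%.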

Variable types

Text: 1
Categorical: 5
Numeric: 4
Boolean: 1

Alerts

가격(백만원) is highly overall correlated with 모델 and 2 other fields (High correlation)
구동방식 is highly overall correlated with 모델 and 1 other field (High correlation)
모델 is highly overall correlated with 가격(백만원) and 2 other fields (High correlation)
배터리용량 is highly overall correlated with 가격(백만원) and 2 other fields (High correlation)
보증기간(년) is highly overall correlated with 연식(년) and 2 other fields (High correlation)
연식(년) is highly overall correlated with 보증기간(년) (High correlation)
제조사 is highly overall correlated with 가격(백만원) and 2 other fields (High correlation)
주행거리(km) is highly overall correlated with 배터리용량 and 2 other fields (High correlation)
차량상태 is highly overall correlated with 배터리용량 and 2 other fields (High correlation)
사고이력 is highly imbalanced (73.2%) (Imbalance)
연식(년) is highly imbalanced (52.7%) (Imbalance)
배터리용량 has 2711 (36.2%) missing values (Missing)
ID has unique values (Unique)
보증기간(년) has 618 (8.2%) zeros (Zeros)
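
The two imbalance percentages can be reproduced from the class counts reported further down. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value; the report does not state the formula, and the function name `imbalance_score` is hypothetical. Under that assumption the results agree with the alerts to one decimal place.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # H_max is the entropy of a uniform distribution over the same classes.
    return 1 - entropy / math.log(len(counts))

accident_history = [7154, 343]   # 사고이력: False, True
model_year = [6395, 566, 536]    # 연식(년): values 0, 2, 1

print(f"{100 * imbalance_score(accident_history):.1f}%")
print(f"{100 * imbalance_score(model_year):.1f}%")
```

A score of 0% would mean perfectly balanced classes, and a score near 100% means almost all rows fall in one class.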

Reproduction

Analysis started: 2025-01-13 12:08:38.276318
Analysis finished: 2025-01-13 12:08:43.533613
Duration: 5.26 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
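
A report of this kind is typically produced with a few lines of Python. The following is a minimal sketch, not the configuration actually used here: the file names `train.csv` and `report.html`, the report title, and the function name `build_report` are all hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path="train.csv", html_path="report.html"):
    """Profile a CSV file and write an HTML report (both paths are hypothetical)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(html_path)
    return profile
```

Calling `build_report()` requires both `pandas` and `ydata-profiling` to be installed and the input file to exist.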

Variables

ID
Text

Unique 

Distinct: 7497
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 58.7 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 74970
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
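
As an illustration of these character properties, the counts for the ID column can be rebuilt with the standard library. This sketch assumes the IDs run consecutively from TRAIN_0000 to TRAIN_7496, which is consistent with the 7497 unique values and the sample rows but is not stated outright in the report.

```python
from collections import Counter
import unicodedata

# Assumption: IDs run from TRAIN_0000 to TRAIN_7496, one per observation.
ids = [f"TRAIN_{i:04d}" for i in range(7497)]
text = "".join(ids)

char_counts = Counter(text)
# unicodedata.category returns the two-letter general category of a character:
# Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(len(text))            # total characters
print(len(char_counts))     # distinct characters
print(dict(category_counts))
```

The digit counts are uneven (for example, fewer 8s and 9s than 0s) because the sequence stops at 7496 rather than at 9999.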

Unique

Unique: 7497
Unique (%): 100.0%

Sample

1st row: TRAIN_0000
2nd row: TRAIN_0001
3rd row: TRAIN_0002
4th row: TRAIN_0003
5th row: TRAIN_0004
Value | Count | Frequency (%)
train_0000 | 1 | < 0.1%
train_0015 | 1 | < 0.1%
train_0003 | 1 | < 0.1%
train_0004 | 1 | < 0.1%
train_0005 | 1 | < 0.1%
train_0006 | 1 | < 0.1%
train_0007 | 1 | < 0.1%
train_0008 | 1 | < 0.1%
train_0009 | 1 | < 0.1%
train_0010 | 1 | < 0.1%
Other values (7487) | 7487 | 99.9%

Most occurring characters

ValueCountFrequency (%)
T 7497
10.0%
R 7497
10.0%
A 7497
10.0%
I 7497
10.0%
N 7497
10.0%
_ 7497
10.0%
0 3300
 
4.4%
3 3300
 
4.4%
2 3300
 
4.4%
1 3300
 
4.4%
Other values (6) 16788
22.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 37485
50.0%
Decimal Number 29988
40.0%
Connector Punctuation 7497
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3300
11.0%
3 3300
11.0%
2 3300
11.0%
1 3300
11.0%
4 3297
11.0%
5 3200
10.7%
6 3200
10.7%
7 2696
9.0%
8 2199
7.3%
9 2196
7.3%
Uppercase Letter
ValueCountFrequency (%)
T 7497
20.0%
R 7497
20.0%
A 7497
20.0%
I 7497
20.0%
N 7497
20.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7497
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37485
50.0%
Common 37485
50.0%

Most frequent character per script

Common
ValueCountFrequency (%)
_ 7497
20.0%
0 3300
8.8%
3 3300
8.8%
2 3300
8.8%
1 3300
8.8%
4 3297
8.8%
5 3200
8.5%
6 3200
8.5%
7 2696
 
7.2%
8 2199
 
5.9%
Latin
ValueCountFrequency (%)
T 7497
20.0%
R 7497
20.0%
A 7497
20.0%
I 7497
20.0%
N 7497
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 74970
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 7497
10.0%
R 7497
10.0%
A 7497
10.0%
I 7497
10.0%
N 7497
10.0%
_ 7497
10.0%
0 3300
 
4.4%
3 3300
 
4.4%
2 3300
 
4.4%
1 3300
 
4.4%
Other values (6) 16788
22.4%

제조사 (manufacturer)
Categorical

High correlation 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.7 KiB
H사: 1237
B사: 1169
K사: 1164
A사: 1142
T사: 1109
Other values (2): 1676

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 14994
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: P사
2nd row: K사
3rd row: A사
4th row: A사
5th row: B사

Common Values

Value | Count | Frequency (%)
H사 | 1237 | 16.5%
B사 | 1169 | 15.6%
K사 | 1164 | 15.5%
A사 | 1142 | 15.2%
T사 | 1109 | 14.8%
P사 | 1071 | 14.3%
V사 | 605 | 8.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
h사 | 1237 | 16.5%
b사 | 1169 | 15.6%
k사 | 1164 | 15.5%
a사 | 1142 | 15.2%
t사 | 1109 | 14.8%
p사 | 1071 | 14.3%
v사 | 605 | 8.1%

Most occurring characters

ValueCountFrequency (%)
사 7497
50.0%
H 1237
 
8.2%
B 1169
 
7.8%
K 1164
 
7.8%
A 1142
 
7.6%
T 1109
 
7.4%
P 1071
 
7.1%
V 605
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 7497
50.0%
Uppercase Letter 7497
50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 1237
16.5%
B 1169
15.6%
K 1164
15.5%
A 1142
15.2%
T 1109
14.8%
P 1071
14.3%
V 605
8.1%
Other Letter
ValueCountFrequency (%)
사 7497
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 7497
50.0%
Latin 7497
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 1237
16.5%
B 1169
15.6%
K 1164
15.5%
A 1142
15.2%
T 1109
14.8%
P 1071
14.3%
V 605
8.1%
Hangul
ValueCountFrequency (%)
사 7497
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 7497
50.0%
ASCII 7497
50.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
사 7497
100.0%
ASCII
ValueCountFrequency (%)
H 1237
16.5%
B 1169
15.6%
K 1164
15.5%
A 1142
15.2%
T 1109
14.8%
P 1071
14.3%
V 605
8.1%

모델 (model)
Categorical

High correlation 

Distinct: 21
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 58.7 KiB
ID4: 605
i5: 414
Niro: 398
Soul: 397
i3: 388
Other values (16): 5295

Length

Max length: 6
Median length: 5
Mean length: 3.3305322
Min length: 2

Characters and Unicode

Total characters: 24969
Distinct characters: 28
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TayGTS
2nd row: Niro
3rd row: eT
4th row: RSeTGT
5th row: i5

Common Values

Value | Count | Frequency (%)
ID4 | 605 | 8.1%
i5 | 414 | 5.5%
Niro | 398 | 5.3%
Soul | 397 | 5.3%
i3 | 388 | 5.2%
RSeTGT | 385 | 5.1%
eT | 379 | 5.1%
ION6 | 379 | 5.1%
Q4eT | 378 | 5.0%
TayGTS | 375 | 5.0%
Other values (11) | 3399 | 45.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
id4 | 605 | 8.1%
i5 | 414 | 5.5%
niro | 398 | 5.3%
soul | 397 | 5.3%
i3 | 388 | 5.2%
rsetgt | 385 | 5.1%
et | 379 | 5.1%
ion6 | 379 | 5.1%
q4et | 378 | 5.0%
taygts | 375 | 5.0%
Other values (11) | 3399 | 45.3%

Most occurring characters

ValueCountFrequency (%)
T 3308
 
13.2%
N 1635
 
6.5%
I 1617
 
6.5%
i 1567
 
6.3%
S 1434
 
5.7%
e 1142
 
4.6%
M 1109
 
4.4%
y 1071
 
4.3%
a 1071
 
4.3%
4 983
 
3.9%
Other values (18) 10032
40.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 14966
59.9%
Lowercase Letter 6838
27.4%
Decimal Number 3165
 
12.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 3308
22.1%
N 1635
10.9%
I 1617
10.8%
S 1434
9.6%
M 1109
 
7.4%
O 872
 
5.8%
G 760
 
5.1%
E 734
 
4.9%
X 631
 
4.2%
D 605
 
4.0%
Other values (6) 2261
15.1%
Lowercase Letter
ValueCountFrequency (%)
i 1567
22.9%
e 1142
16.7%
y 1071
15.7%
a 1071
15.7%
o 795
11.6%
r 398
 
5.8%
l 397
 
5.8%
u 397
 
5.8%
Decimal Number
ValueCountFrequency (%)
4 983
31.1%
5 767
24.2%
6 748
23.6%
3 667
21.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 21804
87.3%
Common 3165
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 3308
15.2%
N 1635
 
7.5%
I 1617
 
7.4%
i 1567
 
7.2%
S 1434
 
6.6%
e 1142
 
5.2%
M 1109
 
5.1%
y 1071
 
4.9%
a 1071
 
4.9%
O 872
 
4.0%
Other values (14) 6978
32.0%
Common
ValueCountFrequency (%)
4 983
31.1%
5 767
24.2%
6 748
23.6%
3 667
21.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24969
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 3308
 
13.2%
N 1635
 
6.5%
I 1617
 
6.5%
i 1567
 
6.3%
S 1434
 
5.7%
e 1142
 
4.6%
M 1109
 
4.4%
y 1071
 
4.3%
a 1071
 
4.3%
4 983
 
3.9%
Other values (18) 10032
40.2%

차량상태 (vehicle condition)
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.7 KiB
Brand New: 3380
Nearly New: 2059
Pre-Owned: 2058

Length

Max length: 10
Median length: 9
Mean length: 9.2746432
Min length: 9

Characters and Unicode

Total characters: 69532
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Nearly New
2nd row: Nearly New
3rd row: Brand New
4th row: Nearly New
5th row: Pre-Owned

Common Values

Value | Count | Frequency (%)
Brand New | 3380 | 45.1%
Nearly New | 2059 | 27.5%
Pre-Owned | 2058 | 27.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
new | 5439 | 42.0%
brand | 3380 | 26.1%
nearly | 2059 | 15.9%
pre-owned | 2058 | 15.9%
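
These word-level counts can be derived from the category counts. A minimal sketch, assuming the tokenization is simply lowercasing followed by a whitespace split (the report does not describe its tokenizer, so this is an assumption that happens to reproduce the numbers):

```python
from collections import Counter

# Category counts from the "Common Values" table for 차량상태.
category_counts = {"Brand New": 3380, "Nearly New": 2059, "Pre-Owned": 2058}

word_counts = Counter()
for label, count in category_counts.items():
    # Assumed tokenization: lowercase, then split on whitespace.
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```

The word "new" appears in both "Brand New" and "Nearly New", which is why its count (5439) is the sum of those two categories and the percentages are taken over 12936 words rather than 7497 rows.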

Most occurring characters

ValueCountFrequency (%)
e 11614
16.7%
N 7498
10.8%
r 7497
10.8%
w 7497
10.8%
a 5439
7.8%
(space) 5439
7.8%
n 5438
7.8%
d 5438
7.8%
B 3380
 
4.9%
l 2059
 
3.0%
Other values (4) 8233
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47041
67.7%
Uppercase Letter 14994
 
21.6%
Space Separator 5439
 
7.8%
Dash Punctuation 2058
 
3.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11614
24.7%
r 7497
15.9%
w 7497
15.9%
a 5439
11.6%
n 5438
11.6%
d 5438
11.6%
l 2059
 
4.4%
y 2059
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
N 7498
50.0%
B 3380
22.5%
P 2058
 
13.7%
O 2058
 
13.7%
Space Separator
ValueCountFrequency (%)
(space) 5439
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2058
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 62035
89.2%
Common 7497
 
10.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11614
18.7%
N 7498
12.1%
r 7497
12.1%
w 7497
12.1%
a 5439
8.8%
n 5438
8.8%
d 5438
8.8%
B 3380
 
5.4%
l 2059
 
3.3%
y 2059
 
3.3%
Other values (2) 4116
 
6.6%
Common
ValueCountFrequency (%)
(space) 5439
72.5%
- 2058
 
27.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 69532
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 11614
16.7%
N 7498
10.8%
r 7497
10.8%
w 7497
10.8%
a 5439
7.8%
(space) 5439
7.8%
n 5438
7.8%
d 5438
7.8%
B 3380
 
4.9%
l 2059
 
3.0%
Other values (4) 8233
11.8%

배터리용량 (battery capacity)
Real number (ℝ)

High correlation  Missing 

Distinct: 194
Distinct (%): 4.1%
Missing: 2711
Missing (%): 36.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.397187
Minimum: 46
Maximum: 99.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 46
5-th percentile: 46.169
Q1: 56.359
Median: 68.125
Q3: 78.227
95-th percentile: 96
Maximum: 99.8
Range: 53.8
Interquartile range (IQR): 21.868

Descriptive statistics

Standard deviation: 15.283635
Coefficient of variation (CV): 0.22023422
Kurtosis: -0.96354826
Mean: 69.397187
Median Absolute Deviation (MAD): 11.502
Skewness: 0.39214255
Sum: 332134.94
Variance: 233.58951
Monotonicity: Not monotonic
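
Several of these statistics are derived from others in the same panel. The sketch below checks those relationships using the reported values for 배터리용량. It assumes the mean is taken over non-missing rows only, which the reported sum and missing count support.

```python
# Values copied from the statistics reported for 배터리용량.
n_observations = 7497
n_missing = 2711
minimum, maximum = 46.0, 99.8
q1, q3 = 56.359, 78.227
mean, std, total = 69.397187, 15.283635, 332134.94

value_range = maximum - minimum          # Range
iqr = q3 - q1                            # Interquartile range
cv = std / mean                          # Coefficient of variation
variance = std ** 2                      # Variance
# Assumption: missing rows are excluded before averaging.
mean_from_sum = total / (n_observations - n_missing)

print(round(value_range, 3), round(iqr, 3), round(cv, 8))
print(round(variance, 4), round(mean_from_sum, 4))
```

Because the mean is taken over the 4786 non-missing rows, any imputation of the 2711 missing values would change every statistic in this panel.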
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
90 | 559 | 7.5%
56 | 327 | 4.4%
46 | 223 | 3.0%
68.488 | 202 | 2.7%
76.093 | 186 | 2.5%
96 | 136 | 1.8%
99.8 | 116 | 1.5%
91.2 | 91 | 1.2%
46.169 | 88 | 1.2%
93.4 | 87 | 1.2%
Other values (184) | 2771 | 37.0%
(Missing) | 2711 | 36.2%
Value | Count | Frequency (%)
46 | 223 | 3.0%
46.09 | 1 | < 0.1%
46.13 | 1 | < 0.1%
46.15 | 2 | < 0.1%
46.169 | 88 | 1.2%
46.21 | 1 | < 0.1%
46.26 | 1 | < 0.1%
46.34 | 1 | < 0.1%
46.42 | 1 | < 0.1%
46.93 | 1 | < 0.1%
Value | Count | Frequency (%)
99.8 | 116 | 1.5%
96 | 136 | 1.8%
95 | 83 | 1.1%
93.4 | 87 | 1.2%
92.16 | 40 | 0.5%
91.2 | 91 | 1.2%
90 | 559 | 7.5%
88.474 | 7 | 0.1%
88.08 | 1 | < 0.1%
87.552 | 7 | 0.1%

구동방식 (drivetrain)
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.7 KiB
AWD: 5167
FWD: 1267
RWD: 1063

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 22491
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AWD
2nd row: FWD
3rd row: AWD
4th row: AWD
5th row: AWD

Common Values

Value | Count | Frequency (%)
AWD | 5167 | 68.9%
FWD | 1267 | 16.9%
RWD | 1063 | 14.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
awd | 5167 | 68.9%
fwd | 1267 | 16.9%
rwd | 1063 | 14.2%

Most occurring characters

ValueCountFrequency (%)
W 7497
33.3%
D 7497
33.3%
A 5167
23.0%
F 1267
 
5.6%
R 1063
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 22491
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 7497
33.3%
D 7497
33.3%
A 5167
23.0%
F 1267
 
5.6%
R 1063
 
4.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 22491
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 7497
33.3%
D 7497
33.3%
A 5167
23.0%
F 1267
 
5.6%
R 1063
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22491
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
W 7497
33.3%
D 7497
33.3%
A 5167
23.0%
F 1267
 
5.6%
R 1063
 
4.7%

주행거리(km) (mileage, km)
Real number (ℝ)

High correlation 

Distinct: 6916
Distinct (%): 92.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44287.979
Minimum: 3
Maximum: 199827
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 3
5-th percentile: 1043.8
Q1: 5465
Median: 17331
Q3: 61252
95-th percentile: 174270.2
Maximum: 199827
Range: 199824
Interquartile range (IQR): 55787

Descriptive statistics

Standard deviation: 55204.064
Coefficient of variation (CV): 1.2464796
Kurtosis: 0.69888503
Mean: 44287.979
Median Absolute Deviation (MAD): 15061
Skewness: 1.3929303
Sum: 3.3202698 × 10^8
Variance: 3.0474887 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7233 | 4 | 0.1%
3012 | 4 | 0.1%
1631 | 4 | 0.1%
978 | 4 | 0.1%
9396 | 3 | < 0.1%
2571 | 3 | < 0.1%
9413 | 3 | < 0.1%
3861 | 3 | < 0.1%
38158 | 3 | < 0.1%
5078 | 3 | < 0.1%
Other values (6906) | 7463 | 99.5%
Value | Count | Frequency (%)
3 | 1 | < 0.1%
4 | 2 | < 0.1%
6 | 1 | < 0.1%
15 | 2 | < 0.1%
16 | 1 | < 0.1%
26 | 2 | < 0.1%
30 | 1 | < 0.1%
31 | 1 | < 0.1%
32 | 2 | < 0.1%
33 | 1 | < 0.1%
Value | Count | Frequency (%)
199827 | 1 | < 0.1%
199819 | 1 | < 0.1%
199818 | 1 | < 0.1%
199766 | 1 | < 0.1%
199760 | 1 | < 0.1%
199647 | 1 | < 0.1%
199515 | 1 | < 0.1%
199457 | 1 | < 0.1%
199384 | 1 | < 0.1%
199329 | 1 | < 0.1%

보증기간(년) (warranty period, years)
Real number (ℝ)

High correlation  Zeros 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.9609177
Minimum: 0
Maximum: 10
Zeros: 618
Zeros (%): 8.2%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 8
95-th percentile: 10
Maximum: 10
Range: 10
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.155342
Coefficient of variation (CV): 0.63603998
Kurtosis: -1.3757981
Mean: 4.9609177
Median Absolute Deviation (MAD): 3
Skewness: -0.036224811
Sum: 37192
Variance: 9.9561831
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
2 | 1358 | 18.1%
7 | 1093 | 14.6%
8 | 1073 | 14.3%
0 | 618 | 8.2%
1 | 552 | 7.4%
10 | 522 | 7.0%
9 | 515 | 6.9%
3 | 494 | 6.6%
5 | 428 | 5.7%
4 | 426 | 5.7%
Value | Count | Frequency (%)
0 | 618 | 8.2%
1 | 552 | 7.4%
2 | 1358 | 18.1%
3 | 494 | 6.6%
4 | 426 | 5.7%
5 | 428 | 5.7%
6 | 418 | 5.6%
7 | 1093 | 14.6%
8 | 1073 | 14.3%
9 | 515 | 6.9%
Value | Count | Frequency (%)
10 | 522 | 7.0%
9 | 515 | 6.9%
8 | 1073 | 14.3%
7 | 1093 | 14.6%
6 | 418 | 5.6%
5 | 428 | 5.7%
4 | 426 | 5.7%
3 | 494 | 6.6%
2 | 1358 | 18.1%
1 | 552 | 7.4%
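
Because 보증기간(년) takes only 11 distinct values, its summary statistics can be recomputed from the frequency tables alone. The sketch below does so, assuming the reported variance is the sample variance (dividing by n - 1), which is an assumption about the estimator rather than something the report states. The helper `weighted_median` is hypothetical.

```python
# Value -> count, combined from the frequency tables for 보증기간(년).
counts = {0: 618, 1: 552, 2: 1358, 3: 494, 4: 426, 5: 428,
          6: 418, 7: 1093, 8: 1073, 9: 515, 10: 522}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n
# Sample variance (divides by n - 1): an assumption about the estimator.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
zeros_pct = 100 * counts[0] / n

def weighted_median(counts):
    """Return the middle value of the expanded data (assumes an odd total count)."""
    middle = (sum(counts.values()) - 1) // 2
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if seen > middle:
            return value

print(n, total, round(mean, 7), round(variance, 7), weighted_median(counts))
```

The distribution has peaks at 2, 7, and 8 years, so the mean of about 4.96 describes relatively few actual vehicles.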

사고이력 (accident history)
Boolean

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB
False: 7154
True: 343
Value | Count | Frequency (%)
False | 7154 | 95.4%
True | 343 | 4.6%

연식(년) (model year, years)
Categorical

High correlation  Imbalance 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.7 KiB
0: 6395
2: 566
1: 536

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 7497
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6395 | 85.3%
2 | 566 | 7.5%
1 | 536 | 7.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 6395
85.3%
2 566
 
7.5%
1 536
 
7.1%

Most occurring characters

ValueCountFrequency (%)
0 6395
85.3%
2 566
 
7.5%
1 536
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7497
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6395
85.3%
2 566
 
7.5%
1 536
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 7497
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6395
85.3%
2 566
 
7.5%
1 536
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7497
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6395
85.3%
2 566
 
7.5%
1 536
 
7.1%

가격(백만원) (price, million KRW)
Real number (ℝ)

High correlation 

Distinct: 3950
Distinct (%): 52.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.331949
Minimum: 9
Maximum: 161.09
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 9
5-th percentile: 22.29
Q1: 34.39
Median: 56
Q3: 80.05
95-th percentile: 135.8
Maximum: 161.09
Range: 152.09
Interquartile range (IQR): 45.66

Descriptive statistics

Standard deviation: 36.646759
Coefficient of variation (CV): 0.58792898
Kurtosis: 0.35800952
Mean: 62.331949
Median Absolute Deviation (MAD): 23.1
Skewness: 1.0033363
Sum: 467302.62
Variance: 1342.985
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
160 | 48 | 0.6%
60 | 45 | 0.6%
100 | 43 | 0.6%
39 | 39 | 0.5%
24 | 36 | 0.5%
35 | 34 | 0.5%
38 | 33 | 0.4%
36 | 30 | 0.4%
23.5 | 25 | 0.3%
99 | 22 | 0.3%
Other values (3940) | 7142 | 95.3%
Value | Count | Frequency (%)
9 | 3 | < 0.1%
9.38 | 1 | < 0.1%
9.66 | 1 | < 0.1%
9.77 | 1 | < 0.1%
9.83 | 1 | < 0.1%
9.92 | 1 | < 0.1%
9.94 | 1 | < 0.1%
10.22 | 1 | < 0.1%
10.46 | 1 | < 0.1%
10.85 | 1 | < 0.1%
Value | Count | Frequency (%)
161.09 | 1 | < 0.1%
161.01 | 1 | < 0.1%
160.99 | 2 | < 0.1%
160.96 | 2 | < 0.1%
160.95 | 1 | < 0.1%
160.94 | 1 | < 0.1%
160.91 | 1 | < 0.1%
160.87 | 2 | < 0.1%
160.86 | 1 | < 0.1%
160.84 | 3 | < 0.1%

Interactions


Correlations

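Variable | 가격(백만원) | 구동방식 | 모델 | 배터리용량 | 보증기간(년) | 사고이력 | 연식(년) | 제조사 | 주행거리(km) | 차량상태
가격(백만원) | 1.000 | 0.431 | 0.792 | 0.581 | -0.269 | 0.000 | 0.113 | 0.595 | -0.108 | 0.189
구동방식 | 0.431 | 1.000 | 0.850 | 0.305 | 0.304 | 0.000 | 0.041 | 0.677 | 0.061 | 0.026
모델 | 0.792 | 0.850 | 1.000 | 0.466 | 0.381 | 0.000 | 0.167 | 0.999 | 0.146 | 0.306
배터리용량 | 0.581 | 0.305 | 0.466 | 1.000 | 0.487 | 0.000 | 0.346 | 0.376 | -0.661 | 0.768
보증기간(년) | -0.269 | 0.304 | 0.381 | 0.487 | 1.000 | 0.000 | 0.582 | 0.389 | -0.707 | 0.758
사고이력 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.010 | 0.000 | 0.026 | 0.000
연식(년) | 0.113 | 0.041 | 0.167 | 0.346 | 0.582 | 0.010 | 1.000 | 0.037 | 0.363 | 0.450
제조사 | 0.595 | 0.677 | 0.999 | 0.376 | 0.389 | 0.000 | 0.037 | 1.000 | 0.034 | 0.008
주행거리(km) | -0.108 | 0.061 | 0.146 | -0.661 | -0.707 | 0.026 | 0.363 | 0.034 | 1.000 | 0.867
차량상태 | 0.189 | 0.026 | 0.306 | 0.768 | 0.758 | 0.000 | 0.450 | 0.008 | 0.867 | 1.000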
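
The "High correlation" alerts at the top of the report can be recovered from this matrix. The sketch below assumes an alert fires when the absolute coefficient between two variables is at least 0.5. That threshold is an assumption, not a value stated in the report, though it reproduces every alert listed. The function name `highly_correlated` is hypothetical.

```python
names = ["가격(백만원)", "구동방식", "모델", "배터리용량", "보증기간(년)",
         "사고이력", "연식(년)", "제조사", "주행거리(km)", "차량상태"]

# Rows and columns follow the order of `names`, copied from the matrix above.
matrix = [
    [1.000, 0.431, 0.792, 0.581, -0.269, 0.000, 0.113, 0.595, -0.108, 0.189],
    [0.431, 1.000, 0.850, 0.305, 0.304, 0.000, 0.041, 0.677, 0.061, 0.026],
    [0.792, 0.850, 1.000, 0.466, 0.381, 0.000, 0.167, 0.999, 0.146, 0.306],
    [0.581, 0.305, 0.466, 1.000, 0.487, 0.000, 0.346, 0.376, -0.661, 0.768],
    [-0.269, 0.304, 0.381, 0.487, 1.000, 0.000, 0.582, 0.389, -0.707, 0.758],
    [0.000, 0.000, 0.000, 0.000, 0.000, 1.000, 0.010, 0.000, 0.026, 0.000],
    [0.113, 0.041, 0.167, 0.346, 0.582, 0.010, 1.000, 0.037, 0.363, 0.450],
    [0.595, 0.677, 0.999, 0.376, 0.389, 0.000, 0.037, 1.000, 0.034, 0.008],
    [-0.108, 0.061, 0.146, -0.661, -0.707, 0.026, 0.363, 0.034, 1.000, 0.867],
    [0.189, 0.026, 0.306, 0.768, 0.758, 0.000, 0.450, 0.008, 0.867, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for a "High correlation" alert

def highly_correlated(names, matrix, threshold=THRESHOLD):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    result = {}
    for i, row in enumerate(matrix):
        partners = [names[j] for j, r in enumerate(row)
                    if j != i and abs(r) >= threshold]
        if partners:
            result[names[i]] = partners
    return result

for name, partners in highly_correlated(names, matrix).items():
    print(name, "->", ", ".join(partners))
```

The 0.999 coefficient between 모델 and 제조사 suggests that the model almost fully determines the manufacturer, so using both as features adds little independent information.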

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

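First rows

Row | ID | 제조사 | 모델 | 차량상태 | 배터리용량 | 구동방식 | 주행거리(km) | 보증기간(년) | 사고이력 | 연식(년) | 가격(백만원)
0 | TRAIN_0000 | P사 | TayGTS | Nearly New | 86.077 | AWD | 13642 | 0 | No | 2 | 159.66
1 | TRAIN_0001 | K사 | Niro | Nearly New | 56.000 | FWD | 10199 | 6 | No | 0 | 28.01
2 | TRAIN_0002 | A사 | eT | Brand New | 91.200 | AWD | 2361 | 7 | No | 0 | 66.27
3 | TRAIN_0003 | A사 | RSeTGT | Nearly New | NaN | AWD | 21683 | 3 | No | 0 | 99.16
4 | TRAIN_0004 | B사 | i5 | Pre-Owned | 61.018 | AWD | 178205 | 1 | No | 0 | 62.02
5 | TRAIN_0005 | H사 | ION6 | Pre-Owned | 58.162 | AWD | 103100 | 3 | No | 0 | 37.02
6 | TRAIN_0006 | T사 | MS | Nearly New | NaN | AWD | 19395 | 3 | No | 0 | 83.42
7 | TRAIN_0007 | A사 | RSeTGT | Nearly New | 78.227 | AWD | 30583 | 5 | No | 1 | 99.66
8 | TRAIN_0008 | T사 | MY | Brand New | NaN | AWD | 2226 | 8 | No | 0 | 74.06
9 | TRAIN_0009 | A사 | Q4eT | Brand New | NaN | AWD | 3683 | 7 | No | 0 | 59.66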
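
Last rows

Row | ID | 제조사 | 모델 | 차량상태 | 배터리용량 | 구동방식 | 주행거리(km) | 보증기간(년) | 사고이력 | 연식(년) | 가격(백만원)
7487 | TRAIN_7487 | H사 | IONIQ | Nearly New | 67.17 | FWD | 29028 | 3 | No | 1 | 11.39
7488 | TRAIN_7488 | T사 | M3 | Brand New | NaN | RWD | 3839 | 7 | Yes | 0 | 46.46
7489 | TRAIN_7489 | H사 | ION5 | Brand New | NaN | AWD | 8871 | 9 | No | 0 | 35.83
7490 | TRAIN_7490 | A사 | Q4eT | Brand New | NaN | AWD | 5794 | 7 | No | 0 | 59.95
7491 | TRAIN_7491 | K사 | Soul | Brand New | NaN | FWD | 5966 | 10 | No | 0 | 16.75
7492 | TRAIN_7492 | H사 | ION5 | Brand New | NaN | AWD | 3773 | 10 | No | 0 | 35.95
7493 | TRAIN_7493 | B사 | i3 | Pre-Owned | 46.00 | RWD | 135411 | 2 | No | 0 | 23.40
7494 | TRAIN_7494 | P사 | TayCT | Brand New | NaN | AWD | 1363 | 2 | No | 0 | 120.00
7495 | TRAIN_7495 | B사 | i3 | Nearly New | 56.00 | RWD | 39445 | 6 | No | 2 | 24.00
7496 | TRAIN_7496 | T사 | MY | Pre-Owned | 51.94 | AWD | 80215 | 0 | No | 0 | 74.06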